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A  UNIFIED  APPROACH  TO  ESTIMATION  IN  LINEAR  MODELS 


WITH  FIXED  AND  MIXED  EFFECTS 


By 

C.  Radhakrishna  Rao 


Center  for  Multivariate  Analysis 
University  of  Pittsburgh 
Pittsburgh,  PA  15260 


Abstract 


)  A  unified  approach  is  developed  for  the  estimation  of  unknown  fixed 
parameters  and  prediction  of  random  effects  in  a  mixed  Gauss-Markoff 
linear  model.  It  is  shown  that  both  the  estimators  and  their  mean  square 
errors  can  be  expressed  in  terms  of  the  elements  of  a  g-inverse  of  a 
partitioned  matrix  which  can  be  set  up  in  terms  of  the  matrices  used  in 
expressing  the  model.  No  assumptions  are  made  on  the  ranks  of  the  matrices 
involved.  The  method  is  parallel  to  the  one  developed  by  the  author  in  the 
case  of  the  fixed  effects  Gauss-Markoff  model  using  a  g-inverse  of  a 
partitioned  matrixV(R«a  t9?l,  1972,  1973,  1985).  - 

A  new  concept  of  generalized  normal  equations  is  introduced  for  the 
simultaneous  estimation  of  fixed  parameters,  random  effects  and  random 
error.  All  the  results  are  deduced  from  a  general  lemma  on  an  optimization 
problem.  This  paper  is  self  contained  as  all  the  algebraic  results  used  are 
stated  and  proved.  The  unified  theory  developed  in  an  earlier  paper  (Rao, 

1988)  is  somewhat  simplified.  _ 

V 

Kev  words  and  phrases:  g-inverse,  Gauss-Markoff  model  with  fixed  and 

mixed  effects,  IPM  (inverse  partitioned  matrix)  method,  normal  equations. 

AMS  classification  index:  62J05 


a 

1 

: 


$ 


$ 

& 

I 

J 


9 


1.  INTRODUCTION 

The  Gauss-Markoff  model  with  fixed  and  random  effects,  called  the 
mixed  linear  model,  is  written  in  the  form 


Y=Xp+U£+  6 


(1.1) 


g- 

Rn  : 

R(Z): 

ft 

P(Z): 

£ 

f 

|  • 
hi 

IV 

Zx: 

Z  : 
tr  Z: 
(A:B): 

where  Y  is  an  n-vector  of  observations,  X  is  a  given  nxm  matrix,  (3  is  an  m- 
vector  of  unknown  fixed  parameters,  U  is  a  given  nxp  matrix,  £  is  a  p- 
vector  of  hypothetical  random  variables.  We  make  the  following 
assumptions  on  the  first  and  second  order  moments  of  £  and  6. 

E(£)  =  Ay,  E(£)  =  0,  D(|)  =  T,  D(£)  =  G,  Cov(£,  6)  =  0.  (1.2) 

We  refer  to  the  model  (1.1)  -  (1.2)  as  the  GM(M)  model  where  M  within 
brackets  refers  to  mixed  effects.  The  corresponding  model  with  fixed  effects 
only,  i.e.,  without  the  term  U  £,  will  be  referred  to  as  GM(F)  when  a 
distinction  has  to  be  made  or  simply  as  the  GM  model  as  it  is  usually  known. 

We  develop  a  simple  and  a  unified  appproach  in  the  general  case,  when 
nothing  is  assumed  about  the  ranks  of  the  matrices  involved,  for  the 
estimation  of  the  fixed  parameter  p  and  the  prediction  (or  estimation)  of  the 
hypothetical  variables  £  and  6,  when  the  other  parameters  y,  T  and  G  are 
partly  known  or  completely  known.  First  we  prove  a  few  algebraic  lemmas. 
The  following  notations  are  used. 


n  dimensional  Euclidean  real  vector  space. 

vector  space  spanned  by  the  column  vectors  of  the  matrix  Z. 

rank  of  the  matrix  Z. 

a  matrix  of  maximum  rank  such  that  Z  Z -1  =  0. 
a  g-inverse  of  Z,  i.e.,  a  matrix  satisfying  the  equation  Z  Z'  Z  =  Z. 
sum  of  the  diagonal  elements  of  Z  when  it  is  a  square  matrix, 
the  matrix  obtained  by  adjoining  the  columns  of  the  matrix  B  to 
those  of  A. 


We  need  the  following  results  on  g-inverses. 


v-u»  v  i 

^IHSPJCT, 


,  _Dl3tribi:t  ion/ 

Availability  Ccd&s 
i  Aval]  ani/or 

|Dl  st  I  Sj'sr  j  al 


Lemma  1.  Let  Z'  be  a  g-inverse  as  defined  above.  Then: 

(i)  a  =  Z'  b  is  a  solution  of  the  consistent  equation  Za  =  b.  (1.3) 

(ii)  p(Z'Z)  =  p(Z)  =  tr(Z'Z).  (1.4) 

Proof.  Since  Z  a  =  b  is  consistent,  b  €  R(Z),  i.e.,  b  =  Zc  for  some  c. 

Then  ZZ'b  =  ZZ'Zc  =  Zc  =  b  which  shows  that  Z‘b  is  a  solution. 

Now  ZZ'ZZ"  =  ZZ*,  i.e.,  ZZ*  is  idempotent  so  that  (1.4)  follows. 


Lemma  2.  Let  G  be  an  n.n.d.  (non-negative  definite)  matrix  of  order  nxn 
X  be  an  nxm  matrix  and 


f  G 

X  ^ 

f 

Ci 

C2 

<  X 

0  , 

c3 

-c4 

) 

(1.5) 


be  any  choice  of  g-inverse.  Then: 


(i)  XCiG  =  XC  j  G  =  0,  XC2X  =  X=XC3X.  (1.6) 

(ii)  GC  |  G  Ci  G  =G  Ci  G  C  j  G  =  GCi  G  Ci  G  =  GC  j  G  =  G  Ci  G  (1.7) 

(iii)  X’  Ci  X  =X'  C  j  X  =  0.  (1.8) 

(iv)  trG  Ci  =  p(G:X)  -  p(X).  (1.9) 


Proof  First  we  show  that  the  equation 


Ga+Xb  =  G  X 
X'  a  =  X'p 


(1.10) 

i  i 

is  consistent  for  any  vectors  X  and  (Ji.  Let  (a  :  (3)  be  a  row  vector  such  that 
r  G  X  ^ 

,  X’  0  , 


(a  :p) 


=  0  =>  a'G  +  p’X’  =  0,  a'X  =  0 


v  ^ 


•  ».»  t.tU'Lt'biLisJs: 


v>  v«.  .v ■.'  ■ .  *,vv\ yv  yw -xa* ~>: y 


=>  a'G  a  +  p'X'a  =  o  =  a'G  a  -  o  =>  a  G  =  o. 


The  last  step  follows  since  G  is  an  n.n.d.  matrix.  Then, 


(a'  :  p1) 


G  X 


X’  pi 


=  o  =* 


G  X 


X  '  (JL 


G  X’ 


X  0 


which  establishes  the  consistency  of  (1.10).  In  such  a  case,  using  the  g- 
inverse  (1.5),  we  find  a  solution  of  (1.10). 

a  =  Ci  G  X  +  C2  XV,  bA  =  C3GX-C4  X'\i. 

Substituting  &  fora  in  the  second  equation  of  (1.10)  and  equating  the 
terms  involving  X  and  }i  on  both  sides  we  obtain 


XCiG=o,  XC2X=X. 


(1.11) 


Further,  the  transpose  of  the  g-inverse  in  (1.5)  is  also  a  g-inverse  in  view  of 
the  symmetry  of  the  left  hand  side  matrix  of  (1.5),  and  results  analogous  to 
(1.11)  hold  giving 


X’C  jG=o,  X  C  3  X  =  X 


(1.12) 


which  prove  (1.6).  Again,  substituting  a  and  6  for  a  and  b  in  the  first 
equation  of  (1.5)  and  equating  the  terms  in  X  on  both  sides,  we  have 


G  Cj  G  +  X  C3  G  =  G. 


(1.13) 


Multiplying  (1.13)  by  G  Ci  and  G  C  j  and  using  (1.11)  and  (1.12),  we  get 
the  equalities  in  (1.7). 

It  is  easy  to  see  that 


G  a  +  Xb  =  X  (jl 
X  a  =0 


(1.14) 


VW-Y  -V  JV-Y  .V.VVVAA  A-VY  ^  A  .V  .V  «N  Jv  A  A  .*•  A  .V  .  -  A  A  A  A  A Y  L*v  A  A  A 


is  a  consistent  equation  for  any  |JL ,  so  that  a  =  C  i  X  [JL  is  a  solution. 
Substituting  a  fora  in  the  second  equation  of  (1.14),  we  find  that  X  Cj  X  p 
=  o  V  |i  ^X'C]X=o  =  X'C  j  X,  which  proves  (1.8). 


Now 


G  X 


X'  0 


C1  C2 


c3  -c4 


GCX  +  XC3  GC2  •  XC4 


=  tr(GC1+XC3)  +  tr  X'C2 
=  tr  GC  1  +  p(XC3)  +  p(X'C2) 
=  tr  GCX  +  p(X)  +  p(X') 
since  XC3  and  X’C2  are  idempotent.  But 


(1.15) 


G  X 


X  0 


C1  C2 


c3  -c4 


'  G  X  \ 

=  P  =  p(G:X)+ p(X).  (1.16) 

lx’  0 


Equating  (1.15)  and  (1.16),  we  have  (1.9)  and  Lemma  2  is  proved.  Now  we 
prove  the  main  lemma. 


Lemma  3  Let  G  and  X  be  as  in  Lemma  2,  g  €  R(G  :  X)  and  p  e  R(X’). 


Then 


min  (a  G  a  +  2  a'  g)  =  a  *  G  a^+la^g 
X  a  =  p 


(1.17) 


where  a*  is  any  solution  to 


G  a  +  X  b  =  -g 
X  a  =  p. 


(1.18) 


With  Cj,  C2,  C3,  C4  as  defined  in  Lemma  2,  one  choice  of  the  solution  for 


5 wwxw 
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(a,  b)  of  (1.18)  is 


a*  =  -C1g+C2p,  b*=-C3g-C4p  (1.19) 

giving  the  expressions  for  the  minimum  in  (1.17)  as 

g'  a*  -  p  b*  =  -  g’Cj  g  +  g’(C2  +  C3)p  +  p'C4  p.  (1.20) 

Proof  Let  a*,  b  *  be  any  solution  of  (1.18),  and  Z  =  X  1 .  Then 
multiplying  the  first  equation  of  (1.18),  with  (a,  b)  replaced  by  (a*,  b*),  by 

i 

Z  and  a  *  we  obtain 

Z  G  a*  +  Z  g  =  o  (1.21) 

a  *G  a*  +  a  *g  =  -  b  ^p  (1.22) 

A  general  solution  of  X  a  =  p  is  a#  +  Z  d  where  d  is  arbitrary.  Then  writing 
a  =  a+  +  Z  d, 

a'G  a  +  2  a  g  =  a  *  G  a*  +  2  a  *  g  +  d  Z  G  Z  d  +  2d  (Z  G  a ^  +  Z  g) 

=  a  t  G  a^  +  2  a  *  g  +  d'Z’G  Z  d,  using  (1.21) 

»  iii 

=  a  *Ga*  +  2  a*g=a*g-bj(cp,  using  (1.22) 

which  proves(1.17).  The  result  (1.20)  is  obtained  by  substituting  the 
expressions  (1.19)  in  (1.22). 

Lemma  3  plays  a  crucial  role  in  estimation  and  prediction  problems  in 
linear  models.  The  results  are  given  in  Section  2. 

Lemma  4.  If  Y  is  an  n-vector  random  variable  with  E(Y)  =  o  and  D(Y)  = 
cr2  G,  then  an  unbiased  estimator  of  cr2  is 


7 

a  ;  Y  =p'C  2Y 
with  the  minimum  variance 

a2  a  *  G  a*  =  -  a2b  *p  =  a2p'C4p  (2.6) 

using  (1.20)  and  the  expressions  for  a*  and  b*  in  (1.19). 

2.2  Estimation  (Prediction)  of  £ 

Consider  a  linear  function  q  £  of  £  and  let  a  Y,  with  E(a'Y)  =  o  =>  a  X  = 
o,  be  its  predictor.  Then  the  mean  square  error  of  prediction  is 

E(q’£  -  a'Y)2=  E(q'£  -  a’£)^ 

=  a2(a  G  a  -  2a'Gq  +  q  Gq).  (2.7) 

Applying  (1.19)  with  g  =  -Gq  and  p  =  o,  the  minimum  of  (2.7)  is  attained 
when  the  predictor  is 

a  iY  =  -(C,g)'Y  =  q'GC  |Y  =q'  t.  (2.8) 

The  minimum  mean  square  error  of  prediction  is,  using  (1.19)  and  (1.20), 
a2(a  *g  -  b  *p  +  q  Gq) 

=  a2(-q'GC  | Gq  +  qGq)  =  CT2q’(G  -GC  |G)q.  (2.9) 

The  results  (2.8)  and  (2.9)  which  hold  for  any  q  imply  that  the  minimum 
dispersion  error  predictor  of  £  is 


(2.10) 


with 


£=GC  Jy  =  GC  i  Y 
D(  £  -  £)  =  0- 2(G  -  GCjG) 


(2.11) 


.V.'tl.V.'l 


>  "J*  V  v 


2.3  Estimation  of  a2 


Now 


E  (  t)  =  E(GCiY)  =  GCiXp  =  o,  by  (1.6) 

D  (  t)  =  D(GCiY)  =  a2GCjGC  J  G 

=  a2GCiG,  by  (1.7). 

Then  using  Lemma  4,  an  unbiased  estimator  of  a2  is 

£’(GCiG)-  £  =  YCiG(GCiG)GC]Y 


(2.12) 


with  a  suitable  divisor.  Note  that  Y  £R(G:X),  so  that  Y  -  G\  +  )  for  a 
suitable  X  and  \x.  Then 

Y'CiG(GCiG)-GCiY  =  X'(GCiG)(GCiG)-(GCiG)X.  (2.13) 

Since  the  terms  in  X  vanish,  using  (1.6).  Hence,  by  the  definition  of 
g-inverse,  (2.13)  reduces  to 


X'GCiGX  =  (|a'X'  +  X’G)  Ci(Xp.  +  GX),  using  (1.6) 
=  Y  CiY. 


(2.14) 


Now 


E(Y'CiY)  =  tr  E(YY'Ci)  =  a2  tr(G  +Xpp'X’)Ci 

=  a2  tr  GC i  +  a2  tr  pp’X’C’X  =  tr  GCi,  using  (1.8) 


=  a2  tr  GCi  =  a2[p(G :X)  -  p(X)],  using  (1.9). 


Then  an  unbiased  estimator  of  a2  is 

a  Y’CiY 

p(G  :X)-p(X) 


(2.15) 


■'  A-V'oV-.-X-:  eVVVVVV' 


2.4  Normal  equations 

The  expressions  for  the  estimates  of  p'|3,  o2  and  6  obtained  in  sections 
2.1  -  2  3  suggest  a  more  direct  way  of  obtaining  them  by  first  solving  the 
consistent  set  of  equations 


Ga  +  xp  =  Y 
X'a  =o. 


(2.16) 


If  (  a ,  g)  is  a  solution  of  (2.6),  then  we  have  the  following: 

(i)  The  BLUE  of  an  estimable  function  p  £  is  p  $. 

(ii)  The  minimum  dispersion  error  predictor  of  £  is  &  -  G  a  . 

(iii)  An  unbiased  estimator  of  o2  is  a 2  =  Y  a/[p(G:X)  -  p(X)]. 

We  may  call  (2.16)  as  the  generalized  normal  equations  for  the  simultaneous 
estimation  of  6  and  (3. 

If  a  and  $  are  obtained  through  a  g-inverse  as  defined  in  (2.2), 
then  we  automatically  have  the  expressions  for  the  precisions  of  the 
estimates: 

V (P  0)  =  O2  p  C4p 
D(  &  -  £)  =  cj2(G  -GCiG). 

When  G'^  exists,  we  can  write  the  equations  (2.16)  in  terms  of  the 
unknowns  £  and  (3  to  be  estimated  in  the  form 

6  +  X(3  =  Y 


X’G  1  £ 


(2.17) 


using  the  relationship  £  =  G  a .  Thus,  the  equations  (2.17)  are  the 
appropriate  normal  equations  for  £  and  (3  when  G‘*  exists.  In  such  a  case. 


eliminating  £  in  the  second  equation  using  the  first  equation  in  (2.17),  we 
have 

XG-1Xp  =  XG  *Y  (2.18) 

which  is  the  usual  normal  equation  for  (3  only.  If  $  is  a  solution  of  (2.18) 

then  from  (2.17),  £  =  Y  -  X  $  the  usual  residual. 

The  equation  (2.16),  however,  is  a  more  natural  one  which  is  simple  to 
set  up  without  any  initial  computations  and  which  does  not  involve  any 
assumptions  on  the  ranks  of  the  matrices  involved. 

2.5  Projection  operator 

The  second  normal  equation  of  (2.16)  implies  that  a  =Z6  where  Z  =X± 

and  8  is  arbitrary.  Substituting  for  a  in  the  first  normal  equation  of  (2.16), 

XP  +  GZ  6  =  Y  (2.19) 

which  provides  the  decomposition  of  the  observed  Y  as  'signal  +  noise’  giving 
estimates  of  Xp  and  £.  Note  that  R(GZ)  and  R(X)  are  disjoint  and  Y  eR(G:X)  = 
R(GZ:X)  w.p.l.  Hence  the  decomposition  (2.19)  is  unique.  If  $  and  £  is  a 
solution  of  (2.19),  then  an  estimate  of  p  (3  is  p  $  and  of  £  is  £  =  GZ  &. 

Since  R(X)  and  R(GZ)  are  disjoint,  although  R(GZ:X)  may  not  span  the 
whole  of  Rn,  there  exist  projection  operators  Px  and  PqZ  onto  R(X)  and 

R(GZ)  in  terms  of  which  Y  can  be  decomposed  as  in  (2.19).  Then 
X  £  =  PXY  and  6  =  GZ  $  =  PGZY  =  (I-PX)Y. 

Rao  (1979)  gives  a  detailed  discussion  of  generalized  projection  operators. 

3.  MIXED  EFFECTS  LINEAR  MODEL 
The  mixed  effects  linear  model  is  of  the  form 


Y  =  Xp  +  U£  +  £ 


with 

E(£)  =  o,  E(£)  =  Ay 


D( 6 )  =  G ,  Cov(£,  £)  =  o,  D(£)  =  T. 


We  write  the  model  in  an  alternative  form 


(3.1) 


where 


Y  =  +  UT|  +  6 


X*  =  (X:UA),  p'=(p’:y'),  T|  =  £  -  A y. 


(3.2) 


3.1  Estimation  of  a  mixed  effect 

Let  p  P*  +  q'r|  be  a  mixed  effect  to  be  estimated.  If  c  +  a  Y  is  an 
unbiased  estimator,  then 


E(c+a  Y-p  P*-q  rj)  =  o  =»  c  =  o  and  X  *a  =  p. 


The  mean  square  error  is 

E[a'(U T|+  £)  -  q'T)]2 

=  a'G^a  -  2a  Urq  +  qTq 

where  G*  =  UTU  +  G.  Applying  Lemma  3,  the  optimum  a  is  a  solution  of 
the  equation 


(3.3) 


If 


G*a  +  X*  b  =  UTq 


X*a 


=  P 


(  G. 


v  ' 

V  A* 


0 


J 


f  o 

c2 

,  C3 

-c4 

best 

linear 

(3.4) 


where 


a*  =  CiUTq  +  C2P 


(3.5) 


and  the  mean  square  error  is,  using  (1.19)  and  (1.20), 

-  a  *Ufq  -  b  *p  +  qTq 


=  q'(T  -  ru'c  JlJOq  +  p'C4p  -p(C  2  +  C  ^UTq.  (3.6) 


Writing  p  =  (p  j ,  p  2)» 

p’P*  +  q'Tl  =  P  J  p  +  P  27  +  q'Tl  =  P  J  p  +  (P  2-q'A)7  +  q’l  (3.7) 

we  find  that  the  formulas  (3.5)  and  (3.6)  cover  all  special  cases  of  linear 
functions  involving  one  or  more  of  the  parameters  (3  and  y  and  the  random 
variable 

3.2  Estimation  of  the  random  error 

Let  r'S  be  a  linear  function  of  the  random  error  estimated  by  a  Y.  The 
condition  of  unbiasedness  implies  that  a'X*  =  o.  The  mean  square  error  is 


as 


E(a'Y-r’£)z  =  a’G*a  -  2a  Gr  +  r  Gr. 


(3.8) 


Applying  Lemma  3,  the  minimum  of  (3.8)  is  attained  when  the  estimator  of 
« 

r’£  is  a  where  a*  =CiGr.  The  minimum  mean  square  error  is  using 
(1-19), 


r  (G-GC  1  G  )r. 


(3.9) 


From  the  above  expressions  it  follows  that  the  minimum  dispersion  error 
estimate  of  6  is 


£  =  GCjY  with  D(  £  -£)  =  G  -  GCjG. 


(3.10) 


3.3  Normal  equations 

The  expressions  for  the  estimators  obtained  in  Sections  3.1  and  3.2 
suggest  the  following  estimation  procedure.  We  set  up  the  generalized 
normal  equations 


X  U  A 


X’  0  0 


A '  U '  0  0 


(3.11) 


and  obtain  a  solutuion  a,  $  and  y .  Then  the  estimate  of  7)  is  r\  =  TU'  ct 

A  A  »  I  ' 

and  of  £  is  £  =  G  a .  The  estimate  of  p  {3  +  p  2  y  when  estimable  is  p  ^ 

£  +  p  2  y  • 

Denoting  (X:UA)  =  X)(t  and  (5  *  =  (P',  y),  the  equations  (3.10)  can  be 
written  as 


1  4 


(G  +UTU’)  a  +  X+p+  =  Y 


x  *  a 


=  o. 


(3.12) 


11  ‘1 
If  G  and  T  exist,  then  multiplying  the  first  equation  by  X  G  and 


r^rU'G'1  and  using  the  second  equation,  we  obtain  the  two  equations 


x’g^Uti  +  X  '  G4Xp+  =X  '  G'1  Y 


(T1  +  U’G^lDri  +  U’G^Xp.  =  UG^Y. 


(3.13) 


Henderson  (1984)  derived  equations  of  the  type  (3.13)  when  A  =  o.  (See  also 
Harville,  1976).  The  equations  (3.12)  provide  estimators  of  T)  and  p  directly. 

When  G  and  T  are  not  both  non-singular  or  when  G  and/or  T  is  a 
complicated  matrix,  other  methods  of  solving  the  equations  (3.11)  could  be 
explored. 

The  estimators  of  £,  €,  p  and  y  obtained  in  Sections  3. 1-3.3  involve 
the  matrices  G  and  f  which  may  not  be  known.  In  the  simplest  possible  case 


G  and  T  may  be  of  the  form  O  |v  !  and  O  ^V2  respectively,  where  Vj  and 


?  7 

V2  are  known  and  a  |  and  <7  |  are  unknown  variance  components.  In  such 


a  case,  o  j  and  a  j,  may  have  to  be  estimated  using  techniques  such  as 


the  MINQE  or  maximum  likelihood  as  described  in  Rao  and  Kleffe  (1988). 
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The  estimates  of  a  |  and  a  %  may  be  substituted  for  the  unknown  values 


in  the  expressions  for  the  estimators  of  6,  P  and  y 
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